National Repository of Grey Literature: 27 records found (records 1 - 10)
Determination of basic form of words
Šanda, Pavel ; Burget, Radim (referee) ; Karásek, Jan (advisor)
Lemmatization is an important preprocessing step for many text-mining applications. The lemmatization process is similar to stemming, with the difference that it determines not only the word stem but also tries to determine the basic form of the word, using the Brute Force and Suffix Stripping methods. The main aim of this paper is to present methods for algorithmic improvement of Czech lemmatization. The training data set created for this work is part of the paper and can be freely used in student and academic work dealing with similar problems.
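The two methods named in the abstract can be illustrated with a minimal sketch. It assumes that "Brute Force" means a direct lookup in a table of known word forms and that "Suffix Stripping" means rule-based replacement of endings. The lexicon, the suffix rules, and the function name `lemmatize` are hypothetical and are not taken from the thesis.

```python
# Toy data for illustration only; not the training set described in the thesis.
LEXICON = {
    "ženy": "žena",
    "ženou": "žena",
    "městem": "město",
}

# Hypothetical (suffix, replacement) rules for a few Czech endings.
SUFFIX_RULES = [
    ("ami", "a"),
    ("ou", "a"),
    ("em", "o"),
    ("y", "a"),
]


def lemmatize(word: str) -> str:
    """Return a lemma candidate for a single word form."""
    token = word.lower()

    # Brute force: exact lookup of the form in the lexicon.
    if token in LEXICON:
        return LEXICON[token]

    # Suffix stripping: try longer suffixes first, keep at least two stem letters.
    for suffix, replacement in sorted(SUFFIX_RULES, key=lambda rule: -len(rule[0])):
        if token.endswith(suffix) and len(token) > len(suffix) + 1:
            return token[: -len(suffix)] + replacement

    # No rule applies: fall back to the lowercased form itself.
    return token
```

In this sketch the lookup is tried first because it is exact, and the rules serve only as a fallback for forms missing from the lexicon.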
Slovak Pattern-based Morphology
Klocok, Andrej ; Dytrych, Jaroslav (referee) ; Smrž, Pavel (advisor)
The aim of this thesis is to become acquainted with methods of morphological analysis and with the representation of data in morphological dictionaries, and to create a system based on technical patterns for the inflectional morphology of the Slovak language. Three tools are derived from this system: a morphological analyzer, which lemmatizes input words and determines their pattern and morphological tag; a tool for comparing and evaluating stemmers, which evaluates stemmers against a derivational dictionary; and a tool for reconstructing diacritics, which was created as an auxiliary tool. In the final chapters, the individual tools are assessed, the morphological analyzer is compared with an available alternative, two implementations of Slovak stemmers are evaluated with the stemmer evaluation tool, and further development of the tools is outlined.
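A pattern-based analyzer of the kind described can be sketched as follows. This is only an illustration under stated assumptions: the single paradigm shown (the Slovak feminine pattern "žena", singular forms only), the tag labels, the stem list, and the function name `analyze` are hypothetical and do not reproduce the data format used in the thesis.

```python
# Hypothetical pattern table: each pattern lists (ending, tag) pairs.
PATTERNS = {
    "žena": {
        "lemma_ending": "a",
        "forms": [
            ("a", "nom.sg"),
            ("y", "gen.sg"),
            ("e", "dat.sg"),
            ("u", "acc.sg"),
            ("e", "loc.sg"),
            ("ou", "ins.sg"),
        ],
    },
}

# Hypothetical dictionary of stems and the pattern each one follows.
STEMS = {"žen": "žena", "ryb": "žena"}


def analyze(word: str) -> list[tuple[str, str, str]]:
    """Return every (lemma, pattern, tag) reading of a word form."""
    readings = []
    for stem, pattern_name in STEMS.items():
        if not word.startswith(stem):
            continue
        ending = word[len(stem):]
        pattern = PATTERNS[pattern_name]
        for form_ending, tag in pattern["forms"]:
            if ending == form_ending:
                lemma = stem + pattern["lemma_ending"]
                readings.append((lemma, pattern_name, tag))
    return readings
```

An ambiguous form such as "rybe" yields two readings here (dative and locative), which is why the function returns a list rather than a single tag.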
Recognition and classification of emotions based on analyzing text messages
Buday, Ondřej ; Burget, Radim (referee) ; Smékal, Zdeněk (advisor)
The main objective of this thesis is to clarify what is known about human emotions and about their recognition and classification by generally known methods. These methods include classification based on facial expression and body language and on tone of voice. The thesis focuses primarily on recognizing and classifying human emotions by analyzing text messages, specifically anger and joy. A Czech database of emotion function words in lemma form was created for this method. It was built by translating English emotion function words and is divided into emotion key words, emotion modification words, and emotion phrases. A separate database of neutral words was also created; these words are excluded from emotion classification, except when emotion phrases are compared. Degrees were assigned to emotion key words and phrases by several people, which makes the database more objective. The values of individual emotion words and phrases can be influenced by emotion modification words, that is, words expressing gradation or negation, or by punctuation. The output of the thesis is an application written in Java. Given an input text message, the application compares individual words with all available databases and computes the resulting emotion rate for joy and anger. The evaluation covers single words, sentences, and paragraphs. The application was developed in NetBeans with the Swing GUI Builder.
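The scoring idea can be sketched in a few lines. The thesis application is written in Java; Python is used here only for brevity. All word lists, degrees, and multipliers are hypothetical. The sketch assumes the input is already in lemma form, that a gradation word multiplies the degree of the next key word, and that a negated key word contributes nothing. Emotion phrases and punctuation are left out.

```python
import re

# Hypothetical databases; the real ones were compiled by the thesis author.
KEYWORDS = {
    "radost": ("joy", 3),
    "šťastný": ("joy", 2),
    "vztek": ("anger", 3),
    "naštvaný": ("anger", 2),
}
GRADATION = {"velmi": 2.0, "trochu": 0.5}
NEGATION = {"ne", "není"}
NEUTRAL = {"a", "to", "je", "já", "jsem"}


def emotion_rates(text: str) -> dict[str, float]:
    """Sum the degrees of emotion key words, adjusted by preceding modifiers."""
    rates = {"joy": 0.0, "anger": 0.0}
    factor, negated = 1.0, False
    for token in re.findall(r"\w+", text.lower()):
        if token in NEUTRAL:
            continue  # neutral words are ignored and do not reset modifiers
        if token in GRADATION:
            factor *= GRADATION[token]
        elif token in NEGATION:
            negated = True
        else:
            if token in KEYWORDS and not negated:
                emotion, degree = KEYWORDS[token]
                rates[emotion] += degree * factor
            # Any other word ends the scope of the pending modifiers.
            factor, negated = 1.0, False
    return rates
```

Keeping the modifiers in two small state variables means a gradation or negation word affects only the next non-neutral word, which is one simple way to model the influence of modification words.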
Parallel Text Alignment
Kadlček, Filip ; Grézl, František (referee) ; Smrž, Pavel (advisor)
This thesis deals with the alignment of parallel corpora. The first part describes approaches to alignment and some alignment tools. Statistical alignment is described first, but the main part concentrates on dictionary-based alignment, which is the core of this thesis. The middle part introduces the principle of dictionary-based alignment together with a simple example. The final part summarizes the results obtained and gives proposals for future development.
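The principle of dictionary-based alignment can be illustrated with a small sketch. It assumes sentence-level alignment and a simple score: the share of source words that have a dictionary translation in the target sentence. The toy Czech-English dictionary and the function names `overlap` and `align` are hypothetical and are not taken from the thesis.

```python
# Toy bilingual dictionary for illustration only.
DICTIONARY = {
    "pes": {"dog"},
    "kočka": {"cat"},
    "spí": {"sleeps", "sleep"},
    "běží": {"runs", "run"},
}


def overlap(source: str, target: str) -> float:
    """Share of source tokens with at least one translation in the target."""
    source_tokens = source.lower().split()
    target_tokens = set(target.lower().split())
    if not source_tokens:
        return 0.0
    hits = sum(1 for token in source_tokens if DICTIONARY.get(token, set()) & target_tokens)
    return hits / len(source_tokens)


def align(source_sentences: list[str], target_sentences: list[str]) -> list[tuple[int, int]]:
    """Pair each source sentence with its best-scoring target sentence."""
    pairs = []
    for i, source in enumerate(source_sentences):
        scores = [overlap(source, target) for target in target_sentences]
        if not scores:
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] > 0:  # leave sentences without any dictionary hit unaligned
            pairs.append((i, best))
    return pairs
```

This greedy pairing ignores sentence order and allows several source sentences to pick the same target; a fuller aligner would add such constraints.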
Web as a Source for Automatic Creation of Morphological Dictionary
Bulka, Pavol ; Matějka, Pavel (referee) ; Smrž, Pavel (advisor)
The formation of words in natural language follows rules that are generally complex. It is often very difficult or even impossible to describe them precisely in a formal way, which is why a morphological dictionary is used to process natural language. This paper discusses the creation of a morphological dictionary from the web of the Slovak top-level domain. It covers web crawling, data processing for morphological analysis, and data structures. The document explains the basic principles and concepts of morphological analysis. The final system described in this thesis produces a morphological dictionary, which can be used in various applications, for example spell checking and machine translation.
Slovak Lemmatization
Lipták, Šimon ; Dytrych, Jaroslav (referee) ; Smrž, Pavel (advisor)
The aim of this bachelor thesis was to become familiar with tools and methods for morphological analysis and lemmatization of words; to design and implement a system that lemmatizes Slovak words not found in the dictionary and then outputs their forms; and to process Slovak data for the implementation of stemming. Finally, the prediction was to be scored on the basis of testing and compared with available alternatives.
Terminology in Visual Arts in a large bilingual dictionary
Kučerová, Daniela ; Vachková, Marie (advisor) ; Šemelík, Martin (referee)
This thesis addresses the terminology of visual arts in a large bilingual dictionary and reflects on the problem of terminology. The thesis is divided into two parts, theoretical and practical. The theoretical part defines language for special purposes, language for general purposes, term, terminology, synonymy, equivalence, and terminological dictionary. The practical part analyses selected visual arts entries in the large bilingual dictionary from the linguistic, terminological, and lexicographical points of view.
On Sense Division in a Bilingual Dictionary
Hagenhoferová, Lucie ; Vachková, Marie (advisor) ; Šemelík, Martin (referee)
This thesis deals with dividing a dictionary entry into several "sub-meanings" (sense division) and with the closely related ordering of these "sub-meanings" (sense ordering), focusing on the passive bilingual German-Czech dictionary. It also deals briefly with distinguishing the senses by various means (sense discrimination). These three topics, which follow the chronological order of lexicographic decisions, are related to the question of equivalence and to the understanding of the concept of meaning. The theoretical part introduces the specifics of sense division and sense ordering in monolingual and bilingual lexicography. The practical part illustrates the possibilities of sense division and sense ordering on ten selected noun lemmas prepared for the Large German-Czech Academic Dictionary, which is in progress. For every lemma, the most suitable arrangement is suggested and then compared with the corresponding entry in the source dictionary, Duden - Deutsches Universalwörterbuch. The differences between the arrangements of the microstructure illustrate the necessity of revising and possibly modifying the adopted structure...
Competing word-formation suffixes for the indication of female gender in professions
Kožuriková, Daniela ; Bozděchová, Ivana (advisor) ; Mareš, Petr (referee)
The bachelor thesis Competing word-formation suffixes for the indication of female gender in professions is divided into two parts. The first part summarizes the findings of the linguistic literature on this topic and discusses the most frequently used suffixes. The second part presents the results of practical research in which the feminine names of professions found in the dictionary Nová slova v češtině 1 a 2 were entered as lemmas into the corpora SYN 2000 and SYN 2005 and their frequency was investigated. The thesis ends with a practical dictionary explaining the meanings of some lesser-known nouns that denote professions.
Syntax of Adjective Compounds in a Bilingual Dictionary
Schmidtová, Iva ; Vachková, Marie (advisor) ; Hejhalová, Věra (referee)
The aim of this thesis is to examine how syntax is handled when dictionary entries for adjectival compounds are created for the Large German-Czech Academic Dictionary. It assumes that precisely these words, which are among the more complicated vocabulary items, cause foreign speakers difficulties in text comprehension and text production. This follows from the nature of German, which forms many compounds, in contrast to Czech. Not only the meanings of these words prove problematic, but also their usage. The thesis is divided into two parts. The theoretical part deals with the characteristics of adjectives as a word class and, because every word has its context, focuses primarily on adjectival valency. Previously, only the valency of verbs and nouns had been discussed. Although a native speaker knows these syntactic structures automatically, and although structural similarities may exist between the source and target language, the structures should still be given in the dictionary. In a monolingual or bilingual dictionary, the user should find complete information on the word searched for. The practical part analyses ten selected adjectival compounds, which shows how the valency structures are...
